1. A data management system, comprising:

a first processor for restoring a plurality of received data files, the data 
files being capable of being different file types; 

a file organizing/categorizing processor, coupled to the first processor, 
for organizing the received data files into data slices, each data slice including 
an identification number and a descriptor that describes characteristics of the 
received data file; 

a file logging processor, coupled to the file organizing/categorizing 
processor, for logging the received data files into a first database based on the 
data slices; 

a data uploading processor, coupled to the file logging processor, for 
uploading the first database to a second database; 

a de-duplicate processor, coupled to the data uploading processor, for 
calculating a SHA value of the received data files to determine whether the 
received data files have duplicates and flagging duplicated data files in the 
second database; 

an image conversion processor, coupled to the de-duplicate processor,
for converting at least a portion of the received data files into image files; and 

a second processor, coupled to the image conversion processor, for 
exporting the image files. 
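
The de-duplication element of claim 1 can be illustrated with a minimal sketch. The claim recites only "a SHA value"; the choice of SHA-256, the function names `sha_digest` and `flag_duplicates`, and the in-memory dictionary standing in for the second database are all assumptions made for illustration.

```python
import hashlib


def sha_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def flag_duplicates(paths):
    """Map each path to True if an earlier path had identical content."""
    seen = {}   # digest -> first path seen with that digest (hypothetical store)
    flags = {}  # path -> duplicate flag
    for p in paths:
        digest = sha_digest(p)
        flags[p] = digest in seen
        seen.setdefault(digest, p)
    return flags
```

In this sketch the first file with a given digest is kept unflagged and every later file with the same digest is flagged, which mirrors flagging rather than deleting duplicated data files.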

2. The system of claim 1, wherein the first database is a local database for at least
one data slice, and the second database is a global database for all logged data slices.

3. The system of claim 1, wherein the image files converted from the data files
are in a standardized image format.

4. The system of claim 1, wherein the data files are in a variety of formats
including Microsoft Mail, Outlook, GroupWise, and Lotus Notes, and the user data
files have a variety of formats including Word, Excel, PowerPoint, and Access.

5. The system of claim 1, wherein an attachment data file in one of the data files 
is associated with the data file such that image files for the data file and the 
corresponding attachment data file are viewed together. 

6. The system of claim 1, wherein the file logging processor, the image
conversion processor, and the second processor are parallel processors such that the
data files are parallel-processed in a data file logging stage, an image conversion
stage, and an image file output stage.

7. The system of claim 1, wherein the data files having the same file type are
converted into the image files together.

8. The system of claim 1, wherein the data management system includes a 
plurality of image conversion processors, each of the image conversion processors 
being capable of converting the data files having the same file type into the
corresponding image files. 

9. The system of claim 1, wherein the file logging processor identifies the file
type of the data files based on the SHA value and a file header of each of the data
files.
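
The file-header part of claim 9 can be sketched as a lookup of leading "magic" bytes. This covers only the header check, not the SHA value; the signature table `SIGNATURES` and the function `sniff_type` are hypothetical names, and the three signatures shown are a small illustrative subset.

```python
# Hypothetical signature table; a production system would carry many more.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    # OLE2 compound file: container used by legacy Word, Excel, PowerPoint
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),
    # ZIP local file header: also the container for .docx, .xlsx, .pptx
    (b"PK\x03\x04", "zip"),
]


def sniff_type(header: bytes) -> str:
    """Return a coarse file type from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"
```

Checking the header rather than the file extension lets a renamed or extension-less file still be routed to the matching converter.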

10. The system of claim 1, further comprising a keyword search processor,
coupled to the file logging processor, for searching a keyword from the received data
files, wherein if there is a hit, the corresponding data file is retained for processing,
and the data file without a hit is discarded without being processed.
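
The retain-or-discard behavior of claim 10 can be sketched as a simple partition. The function name `keyword_filter`, the `(name, text)` input shape, and case-insensitive substring matching are assumptions; the claim does not say how a hit is determined.

```python
def keyword_filter(files, keywords):
    """Split (name, text) pairs into retained and discarded file names.

    A file is retained if any keyword occurs in its text
    (case-insensitive substring match, an assumption of this sketch).
    """
    lowered = [k.lower() for k in keywords]
    retained, discarded = [], []
    for name, text in files:
        body = text.lower()
        hit = any(k in body for k in lowered)
        (retained if hit else discarded).append(name)
    return retained, discarded
```

For example, `keyword_filter([("a.txt", "Merger draft"), ("b.txt", "lunch menu")], ["merger"])` would place `a.txt` in the retained list and `b.txt` in the discarded list.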

11. The system of claim 1, further comprising a keyword search processor,
coupled to the image conversion processor, for searching a keyword from the image
files, wherein if there is a hit, the corresponding image file is exported, and the image
file without a hit is not exported.

12. The system of claim 1, further comprising a file status filter to indicate 
different statuses of the received data files. 

13. The system of claim 12, wherein the different statuses comprise New,
In-Progress, Done, Error, Corrupted, Encrypted, No Keyword Hit, Big File, and Large
Page Count.
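
The statuses listed in claim 13 map naturally onto an enumeration, and a status filter then reduces to a selection over it. The names `FileStatus` and `filter_by_status` and the `(name, status)` record shape are hypothetical, offered only as one possible sketch.

```python
from enum import Enum


class FileStatus(Enum):
    """The statuses named in claim 13."""
    NEW = "New"
    IN_PROGRESS = "In-Progress"
    DONE = "Done"
    ERROR = "Error"
    CORRUPTED = "Corrupted"
    ENCRYPTED = "Encrypted"
    NO_KEYWORD_HIT = "No Keyword Hit"
    BIG_FILE = "Big File"
    LARGE_PAGE_COUNT = "Large Page Count"


def filter_by_status(records, status):
    """Return names of (name, status) records matching the given status."""
    return [name for name, s in records if s is status]
```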

14. A data management method, comprising the steps of:

restoring a plurality of received data files, the data files being capable of being 
different file types; 

organizing/categorizing the received data files into data slices, each data slice 
including an identification number and a descriptor that describes characteristics of 
the received data file; 

logging the received data files into a first database based on the data slices; 

uploading the first database to a second database; 

de-duplicating duplicates in the received data files by calculating a SHA value
of the received data files to determine whether the received data files have duplicates 
and flagging duplicated data files in the database; 

converting at least a portion of the received data files into image files, 
respectively; and 

exporting the image files. 

15. The method of claim 14, further comprising the step of viewing the image files 
stored in the second database. 

16. The method of claim 14, wherein the step of converting of the data files 
comprises the step of converting the data files into a standardized image format. 

17. The method of claim 14, further comprising the step of searching a keyword
from the received data files, wherein if there is a hit, the corresponding data file is
retained for processing, and the data file without a hit is discarded without being
processed.

18. The method of claim 14, further comprising the step of searching a keyword
from the image files, wherein if there is a hit, the corresponding image file is exported,
and the image file without a hit is not exported.



